Continued Examination Under 37 CFR 1.114 

In view of the Appeal Brief filed on April 10, 2008, PROSECUTION IS HEREBY 
REOPENED. The new grounds of rejection are set forth below. 

To avoid abandonment of the application, appellant must exercise one of the 
following two options: 

(1) file a reply under 37 CFR 1.111 (if this Office action is non-final) or a reply 
under 37 CFR 1.113 (if this Office action is final); or, 

(2) initiate a new appeal by filing a notice of appeal under 37 CFR 41.31 followed 
by an appeal brief under 37 CFR 41.37. The previously paid notice of appeal fee and 
appeal brief fee can be applied to the new appeal. If, however, the appeal fees set forth 
in 37 CFR 41.20 have been increased since they were previously paid, then appellant 
must pay the difference between the increased fees and the amount previously paid. 

A Supervisory Patent Examiner (SPE) has approved of reopening prosecution by 
signing below: /Richemond Dorvil/ 

Supervisory Patent Examiner, Art Unit 2626 

1. Applicant's arguments with respect to Shiotani and Schulz have been 
considered but are moot in view of the new ground(s) of rejection. 

Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

Claims 1-11, 13-31, 33-38, 40, and 44 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Foster ("Target-Text Mediated Interactive Machine 
Translation" Machine Translation, 1997) in view of Schulz (6,360,237). 

2. As per claims 1 and 20, Foster discloses a method and system for facilitating 
translation of an audio signal that includes speech to another language, comprising: 

Retrieving a textual representation (page 179, section 3, first paragraph, the 
translator selects text, therefore a textual representation must have been retrieved); 

Presenting the textual representation to a user (page 179, section 3, first 
paragraph, the translator selects text, therefore a textual representation must have been 
presented to the user); 

Receiving selection of a segment of the textual representation for translation 
(page 179, section 3, first paragraph, the translator selects a portion of the source text, 
usually a sentence, for translation); 

Receiving translation made by the user (page 179, section 3, first paragraph, the 
translator selects a portion of the source text, usually a sentence, and types in the 
translation). 

Foster does not disclose retrieving a textual representation of an audio signal, 
obtaining a portion of the audio signal corresponding to the segment of the textual 
representation, providing the segment of the textual representation and the portion of 
the audio signal to the user, and receiving a translation made by the user of the portion 
of the audio signal. Rather, as noted above, Foster discloses human translation of text, 
without providing specifics as to where the text came from. However, speech 
recognition systems are commonly used to convert speech to text, as indicated in 
Schulz (column 1 lines 27-34, speech recognition is used for transcription). Schulz also 
discloses a system that synchronizes text with a specific spoken word during playback 
of an audio file (column 5 lines 30-33). In Schulz, a text editor is used that automatically 
aligns a cursor in the written text on a screen with a specific spoken word during 
playback of an audio file. All of the elements of claims 1 and 20 are known in references 
Foster and Schulz; the only difference is their combination for use in a translation 
system. 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use known methods to retrieve a textual representation of an audio 
signal for translation in Foster, since it would provide automatic transcription, saving 
transcription costs (Schulz, column 1 lines 27-34), while enabling a user to provide fast 
and accurate translation of speech data. 

It would also have been obvious to one of ordinary skill in the art at the time of 
the invention to combine the known elements of audio and text synchronization with 
Foster, since the combination would produce the predictable result of enabling the user 
to quickly and easily translate and edit text displayed on the monitor, including 
identifying and correcting errors, without interruption during playback of the speech from 
an audio recording, as indicated in Schulz (column 5 lines 55-58). 

3. As per claim 21, Foster discloses a translation system, comprising: 

Obtaining a textual representation (page 179, section 3, first paragraph, the 
translator selects text, therefore a textual representation must have been retrieved); 

Presenting the transcription to a user (page 179, section 3, first paragraph, the 
translator selects text, therefore a textual representation must have been retrieved); 

Receiving selection of a portion of the transcription for translation (page 179, 
section 3, first paragraph, the translator selects text, therefore a textual representation 
must have been retrieved); 

Receive from the user a translation made by the user of the portion of the audio 
signal (page 179, section 3, first paragraph, the translator selects a portion of the source 
text, usually a sentence, and types in the translation). 

Foster does not disclose a memory configured to store instructions, and a 
processor configured to execute the instructions in memory to perform the 
aforementioned steps as well as obtain a transcription of an audio signal that includes 
speech, retrieve a portion of the audio signal corresponding to the portion of the 
transcription, and provide the portion of the transcription and the portion of the audio 
signal to the user. However, Foster discloses a system for Interactive Machine 
Translation, where the user provides a translation of the source data using a machine 
translation system as a resource. The use of the MT system suggests the use of a 
computer, including memory and a processor configured to execute instructions from 
memory. Additionally, speech recognition systems are commonly used to convert 
speech to text, as indicated in Schulz (column 1 lines 27-34, speech recognition is used 
for transcription). Schulz also discloses a system that synchronizes text with a specific 
spoken word during playback of an audio file (column 5 lines 30-33). In Schulz, a text 
editor is used that automatically aligns a cursor in the written text on a screen with a 
specific spoken word during playback of an audio file. All of the elements of claim 21 are 
known in references Foster and Schulz; the only difference is their combination for use 
in a translation system. 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to have a memory and processor configured to execute the instructions 
stored in memory in Foster, since a computer system can perform calculations and 
execute instructions extremely quickly, thus decreasing processing time and enabling a 
real-time application. 

It would also have been obvious to one of ordinary skill in the art at the time of 
the invention to use known methods to retrieve a textual representation of an audio 
signal for translation in Foster, since it would provide automatic transcription, saving 
transcription costs (Schulz, column 1 lines 27-34), while enabling a user to provide fast 
and accurate translation of speech data. 

It would also have been obvious to one of ordinary skill in the art at the time of 
the invention to combine the known elements of audio and text synchronization with 
Foster, since the combination would produce the predictable result of enabling the user 
to quickly and easily translate and edit text displayed on the monitor, including 
identifying and correcting errors, without interruption during playback of the speech from 
an audio recording, as indicated in Schulz (column 5 lines 55-58). 

4. As per claims 2 and 22, Foster in view of Schulz disclose the method and 
system of claims 1 and 21, but Foster does not explicitly disclose wherein the retrieving 
a textual representation includes generating a request for information, sending the 
request to a server, and obtaining, from the server, at least the textual representation of 
the audio signal. However, Foster discloses a system for Interactive Machine 
Translation, where the user provides a translation of the source data, displayed as text, 
using a machine translation system as a resource. The use of the MT system suggests 
the use of a computer, including memory and a processor configured to execute 
instructions from memory. In addition, in any computer system, software instructions, for 
example function calls, are executed in order to retrieve data from memory, such as a 
server, for further processing. 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to apply the known technique of sending a request for information to a 
server and obtain a textual representation of the audio signal in Foster, since it would 
enable the user to process information previously stored in memory. 

5. As per claims 3 and 23, Foster in view of Schulz disclose the method and 
system of claims 1 and 21, and Schulz further discloses wherein the presenting the 
textual representation to a user, includes: obtaining the audio signal, providing the audio 
signal and the textual representation of the audio signal to the user, and visually 
synchronizing the providing of the audio signal with the textual representation of the 
audio signal (column 5 lines 30-33 and column 6 lines 29-30, the audio signal is 
provided to the user, synchronized with the text. Therefore the audio signal must have first 
been obtained). Schulz discloses a system that synchronizes text with a specific 
spoken word during playback of an audio file (column 5 lines 30-33). In Schulz, a text 
editor is used that automatically aligns a cursor in the written text on a screen with a 
specific spoken word during playback of an audio file. All of the elements of claims 3 
and 23 are known in the references Foster and Schulz; the only difference is their 
combination for use in a translation system. 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to combine the known elements of audio and text synchronization with 
Foster, since the combination would produce the predictable result of enabling the user 
to quickly and easily translate and edit text displayed on the monitor, including 
identifying and correcting errors, without interruption during playback of the speech from 
an audio recording, as indicated in Schulz (column 5 lines 55-58). 

6. As per claims 4 and 24, Foster in view of Schulz disclose the method and 
system of claims 3 and 23, and Schulz further discloses wherein the obtaining the 
audio signal includes accessing a database of original media to retrieve the audio signal 
(column 5 lines 30-33, the audio recording is played back and aligned with the words on 
the screen. The audio played back is from an audio recording; therefore the audio must 
have been accessed from a recording medium or memory, such as a database). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to access a database of original media to retrieve the audio signal in 
Foster, since it would enable the user to process information previously stored in the 
database. 

7. As per claims 5, 8, 25, and 28, Foster in view of Schulz disclose the method and 
system of claims 3, 1, 23, and 21, and Schulz further discloses wherein the obtaining the 
audio signal includes receiving input, from the user, regarding a desire for the audio 
signal (column 12 line 63-column 13 line 12, if the user enters a command to start 
playback of the audio signal, the playback edit function mode is entered, otherwise the 
system enters the standard editing mode), initiating a media player, and using the media 
player to obtain the audio signal (column 12 line 63-column 13 line 12, if the user enters 
a command to start playback of the audio signal, the playback edit function mode is 
entered and playback of the audio recording synchronized with the text begins. Since 
the audio, a type of media, is output, it must have been obtained and output through 
a media player). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to receive input, from the user, regarding a desire for the audio signal, 
initialize a media player, and use the media player to obtain the audio signal in Foster, 
since it would enable the user to quickly and easily translate and edit text displayed on 
the monitor, including identifying and correcting errors, without interruption during 
playback of the speech from an audio recording, as indicated in Schulz (column 5 lines 
55-58). 

8. As per claims 6 and 26, Foster in view of Schulz disclose the method and 
system of claims 1 and 21, but Foster does not explicitly disclose wherein the receiving 
selection of a segment of the textual representation includes identifying a portion of the 
textual representation selected by the user, accessing a server to obtain text 
corresponding to the portion of the textual representation, and receiving, from the 
server, the text corresponding to the portion of the textual representation. However, 
Foster discloses a system for Interactive Machine Translation, where the user provides 
a translation of the source data using a machine translation system as a resource. The 
use of the MT system suggests the use of a computer, including memory and a 
processor configured to execute instructions from memory. In addition, in any computer 
system, software instructions, for example function calls, are executed in order to 
retrieve data from memory, such as a server, for further processing. 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to apply the known technique of accessing and receiving text from a 
server in Foster, since it would enable the system to process information previously 
stored in memory. 

9. As per claims 7 and 27, Foster in view of Schulz disclose the method and 
system of claims 6 and 26, and Schulz further discloses wherein the text includes a 
transcription of the audio signal and metadata corresponding to the portion of the textual 
representation (column 4 lines 52-59, a file containing the transcription of the input 
speech also contains beginning and end times for each word and silent pauses). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to have a text file that includes a transcription and metadata in Foster, 
since it would enable the system to locate pauses, and suppress them during playback, 
as indicated in Schulz (column 4 lines 60-65). 

10. As per claims 9 and 29, Foster in view of Schulz disclose the method and 
system of claims 8 and 28, and Schulz further discloses wherein the using the media 
player includes identifying, by the media player, the segment of the textual 
representation, and retrieving the portion of the audio signal corresponding to the 
segment of the textual representation (column 6 lines 18-30, the system uses the 
beginning and ending times of words to align the cursor on the monitor with a particular 
displayed word during playback of the audio recording. Since the audio is played back 
synchronized with the time information from the text file, a media player must have 
identified the textual representation and retrieved the audio signal). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to identify, by the media player, the segment of the textual 
representation, and retrieve the portion of the audio signal corresponding to the 
segment of the textual representation in Foster, since it would enable the user to 
quickly and easily translate and edit text displayed on the monitor, including identifying 
and correcting errors, without interruption during playback of the speech from an audio 
recording, as indicated in Schulz (column 5 lines 55-58). 

11. As per claims 10, 11, 30, and 31, Foster in view of Schulz disclose the method 
and system of claims 9 and 29, and Schulz further discloses wherein the segment of 
the textual representation includes a starting position in the textual representation, and 
wherein the identifying the segment includes identifying time codes associated with 
the beginning and ending of the textual representation (column 6 lines 18-30, the 
system uses the beginning and ending times of words to align the cursor on the monitor 
with a particular displayed word during playback of the audio recording). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to have a textual representation that includes a starting position, and 
identify time codes associated with the beginning and end times of the textual 
representation in Foster, since it would enable the user to quickly and easily translate 
and edit text displayed on the monitor, including identifying and correcting errors, 
without interruption during playback of the speech from an audio recording, as indicated 
in Schulz (column 5 lines 55-58). 

12. As per claims 13 and 33, Foster in view of Schulz disclose the method and 
system of claims 1 and 21, and Schulz further discloses wherein the providing the 
segment of the textual representation and the portion of the audio signal to the user 
includes visually synchronizing the providing of the portion of the audio signal with the 
segment of the textual representation (column 5 lines 30-33 and column 6 lines 29-30). 
Schulz discloses a system that synchronizes text with a specific spoken word during 
playback of an audio file (column 5 lines 30-33). In Schulz, a text editor is used that 
automatically aligns a cursor in the written text on the screen with a specific spoken 
word during playback of the audio file. All of the elements of claims 13 and 33 are 
known in the references Foster and Schulz; the only difference is their combination for 
use in a translation system. 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to combine the known elements of audio and text synchronization with 
Foster, since the combination would produce the predictable result of enabling the user 
to quickly and easily translate and edit text displayed on the monitor, including 
identifying and correcting errors, without interruption during playback of the speech from 
an audio recording, as indicated in Schulz (column 5 lines 55-58). 

13. As per claims 14 and 34, Foster in view of Schulz disclose the method and 
system of claims 13 and 33, and Schulz further discloses wherein the segment of the 
textual representation includes time codes corresponding to when words in the textual 
representation were spoken (column 4 lines 52-59, a file containing the transcription of 
the input speech also contains beginning and end times for each word and silent 
pauses). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to have a textual representation that includes time codes corresponding 
to when words in the textual representation were spoken in Foster, since it would 
enable the system to locate pauses, and suppress them during playback, as indicated in 
Schulz (column 4 lines 60-65). 

14. As per claims 15 and 35, Foster in view of Schulz disclose the method and 
system of claims 14 and 34, and Schulz further discloses wherein the visually 
synchronizing the providing of the portion of the audio signal with the segment of the 
textual representation includes comparing times corresponding to the providing of the 
portion of the audio signal to the time codes from the segment of the textual 
representation, and visually distinguishing words in the segment of the textual 
representation when the words are spoken during the providing of the portion of the 
audio signal (column 6 lines 18-30). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to compare times corresponding to the providing of the portion of the 
audio signal to the time codes from the segment of the textual representation, and 
visually distinguish words in the segment of the textual representation when the 
words are spoken during the providing of the portion of the audio signal in Foster, since 
it would enable the user to quickly and easily translate and edit text displayed on the 
monitor, including identifying and correcting errors, without interruption during playback 
of the speech from an audio recording, as indicated in Schulz (column 5 lines 55-58). 
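
For illustration only, the time-code comparison relied upon above (Schulz, column 6 lines 18-30) can be sketched as follows. This is a minimal sketch and is not taken from Schulz or from the application: the `TimedWord` record, the `word_index_at` and `render` helpers, the bracket marking, and the sample timings are all hypothetical. It assumes each transcribed word carries a beginning and an ending time in seconds and that the words are sorted by beginning time.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional, Sequence


class TimedWord(NamedTuple):
    """Hypothetical record: one transcribed word with its time codes (seconds)."""
    text: str
    start: float
    end: float


def word_index_at(words: Sequence[TimedWord], t: float) -> Optional[int]:
    """Return the index of the word being spoken at playback time t.

    Returns None when t falls before the first word or inside a pause
    between words. Assumes `words` is sorted by start time.
    """
    starts = [w.start for w in words]
    i = bisect_right(starts, t) - 1  # last word that begins at or before t
    if i >= 0 and t < words[i].end:
        return i
    return None


def render(words: Sequence[TimedWord], t: float) -> str:
    """Return the transcript with the currently spoken word marked in brackets."""
    current = word_index_at(words, t)
    return " ".join(
        f"[{w.text}]" if i == current else w.text
        for i, w in enumerate(words)
    )


# Hypothetical sample data: three words with a pause before "fox".
words = [
    TimedWord("the", 0.0, 0.2),
    TimedWord("quick", 0.25, 0.6),
    TimedWord("fox", 0.9, 1.3),
]
```

Calling `render(words, t)` repeatedly as the playback time advances would move the bracket from word to word, and a time that falls in a pause marks no word at all.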

15. As per claims 16, 17, 36, and 37, Foster in view of Schulz disclose the method of 
claims 1 and 21, and Schulz further discloses wherein the providing the segment of the 
textual representation and the portion of the audio signal to the user includes permitting 
the user to control the providing of the portion of the audio signal by allowing the user to 
at least one of fast forward, speed up, slow down, and back up the providing of the 
portion of the audio signal using foot pedals (column 2 lines 29-34). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to control the providing of the portion of the audio signal by allowing the 
user to at least one of fast forward, speed up, slow down, and back up the providing of 
the portion of the audio signal using foot pedals in Foster, since it would enable the 
user to control playback of the audio file, and thus quickly and efficiently process the 
source data into target data. 

16. As per claims 18 and 38, Foster in view of Schulz disclose the method of claims 
16 and 36, and Schulz further discloses wherein the permitting the user to control the 
providing of the portion of the audio signal includes permitting the user to rewind the 
portion of the audio signal at least one of a predetermined amount of time and a 
predetermined amount of words (column 2 lines 29-34, the user can use keyboard input 
or a foot control to control the audio signal, including moving forward and rewinding). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to permit the user to rewind the portion of the audio signal at least one 
of a predetermined amount of time and a predetermined amount of words in Foster, 
since it would enable the user to control playback of the audio file, and thus quickly and 
efficiently process the source data into target data. 

17. As per claim 40, Foster discloses a graphical user interface, comprising: 

A text input section that includes text information in a first language (page 179, 
section 3, first paragraph, the translator selects text, therefore a textual representation 
must have been input); 

A translation section that receives a translation made by the user into a second 
language (page 179, section 3, first paragraph, the translator selects a portion of the 
source text, usually a sentence, and types in the translation). 

Foster does not disclose a transcription section that includes a transcription of 
non-text information in a first language, a translation section that receives a translation 
made by the user of the non-text information, and a play button that, when selected, 
causes the retrieval of the non-text information to be initiated, playing of the non-text 
information, and the playing of the non-text information to be visually synchronized with 
the transcription in the transcription section. However, speech recognition systems are 
commonly used to convert speech to text, as indicated in Schulz (column 1 lines 27-34, 
speech recognition is used for transcription). Schulz also discloses a system that 
synchronizes text with a specific spoken word during playback of an audio file (column 5 
lines 30-33). In Schulz, a text editor is used that automatically aligns a cursor in the 
written text on the screen with a specific spoken word during playback of the audio file. 
All of the elements of claim 40 are known in references Foster and Schulz; the only 
difference is their combination for use in a translation system. 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use known methods to retrieve a transcript of non-text information in 
a first language in Foster, since it would provide automatic transcription, saving 
transcription costs (Schulz, column 1 lines 27-34), while enabling a user to provide fast 
and accurate translation of speech data. 

It would also have been obvious to one of ordinary skill in the art at the time of 
the invention to combine the known elements of audio and text synchronization with 
Foster, since the combination would produce the predictable result of enabling the user 
to quickly and easily translate and edit text displayed on the monitor, including 
identifying and correcting errors, without interruption during playback of the speech from 
an audio recording, as indicated in Schulz (column 5 lines 55-58). 

18. As per claim 44, Foster in view of Schulz disclose the graphical user interface of 
claim 40, and Schulz further discloses wherein the play button further causes words in 
the transcription to be visually distinguished in synchronism with the words in the non- 
text information being played (column 6 lines 18-30). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to have a play button that causes words in the transcription to be 
visually distinguished in synchronism with the words in the non-text information being 
played in Foster, since it would enable the user to quickly and easily translate and edit 
text displayed on the monitor, including identifying and correcting errors, without 
interruption during playback of the speech from an audio recording, as indicated in 
Schulz (column 5 lines 55-58). 

19. As per claim 45, Foster in view of Schulz disclose the graphical user interface of 
claim 40, and Schulz further discloses wherein the non-text information includes at 
least one of audio and video (column 4 lines 46-59, a speech recognition unit converts a 
recording of speech (audio non-text information) into a text file). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to process non-text information that includes at least one of audio and 
video in Foster, since it would enable the system to translate spoken language as well 
as textual documents. 

Claims 41 and 46 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Foster in view of Schulz as applied to claim 40 above, and further in view of 
Saindon (6,820,055). 

21. Foster in view of Schulz disclose the graphical user interface of claim 40; 
however, neither discloses wherein the transcription visually distinguishes names of 
people, places, and organizations and wherein the graphical user interface is 
associated with a word processing application. Saindon discloses a system for 
automated transcription and translation that processes text to visually distinguish the 
names of people, places and organizations using a word processor (column 16 lines 34- 
65, the system processes the text to determine if all proper nouns are capitalized using 
software such as Microsoft Word). All of the elements of claims 41 and 46 are known in 
references Foster, Schulz, and Saindon; the only difference is their combination for use 
in a translation system. 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to apply the known technique of having a transcription that visually 
distinguishes names of people, places, and organizations and a graphical user interface 
that is associated with a word processing application in Foster and Schulz, since it would 
enable the system to generate text that provides accurate translations, as indicated in 
Saindon (column 16 lines 38-40), using reliable commercially established software that 
is readily available. 

Claims 12, 19, 32, 39, 42, 43, and 47 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Foster in view of Schulz as applied to claims 1, 21, and 40 
above, and further in view of Shiotani (4,814,988). 

22. As per claims 12 and 32, Foster in view of Schulz disclose the method and 
system of claims 1 and 21; however, neither discloses wherein the providing the segment 
of the textual representation and the portion of the audio signal to the user includes 
displaying the segment of the textual representation in a same window as will be used 
by the user to provide the translation of the portion of the audio signal, including as a 
split screen in a translation window. Shiotani discloses wherein the providing the 
segment of the textual representation and the portion of the audio signal to the user 
includes displaying the segment of the textual representation in a same window as will 
be used by the user to provide the translation of the portion of the audio signal, 
including as a split screen in a translation window (column 2 lines 15-20 and Figure 4(a) 
and 4(b)). Shiotani discloses a machine translation system where the source string and 
target string appear side-by-side in the same window. 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to display the segment of the textual representation in a same window 
as will be used by the user to provide the translation of the portion of the audio signal, 
including as a split screen in a translation window in Foster and Schulz, since one of 
ordinary skill in the art has good reason to pursue the options within his or her technical 
grasp in order to achieve the predictable result of quickly and efficiently translating 
source information. 

23. As per claims 19 and 39, Foster in view of Schulz disclose the method of claims 
1 and 21, but neither explicitly discloses publishing the translation to a 
user-determined location. However, Schulz does disclose a text editor used to synchronize 
text and audio information when editing the textual information (column 5 lines 30-33). 
In text editing software, such as Microsoft Word or OpenOffice, the user has many 
options once a document is complete. It can be saved to a file, transmitted over 
the internet, displayed on a screen, sent to a printer, or a combination thereof. In addition, 
Shiotani discloses sending the translation to a CRT display (user-defined location) 
(column 3 lines 2-4). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to publish the translation to a user-determined location in Foster and 
Schulz, since it would enable the user to save the translation for use at a later time, or 
output the translation for current use. 



24. As per claims 42 and 43, Foster in view of Schulz disclose the graphical user 
interface of claim 40, but neither explicitly discloses a configuration button that, when 
selected, causes a window to be presented, the window permitting an amount of backup 
to be specified, the amount of backup including one of a predetermined amount of time 
and a predetermined number of words, and wherein the window further permits a name 
to be given for the translation and a location of publication to be specified. However, 
Shiotani does disclose a translation buffer for storing the result of translation of a 
selected portion of the input (column 2 lines 38-41). The translation buffer stores a 
predetermined number of words, i.e. the region of the text specified by the user and 
then translated. In addition, the use of a configuration button to present a window that 
permits a name to be given to a file and a location of publication to be specified is a 
feature of any text editing or word processing software, running on any of a number of 
operating systems, such as Windows and Linux. The software enables the user to use 
the save button (configuration button), located under a file menu in a task bar, to choose 
a location in memory as well as a name for the file. 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to apply the known technique of using a configuration button, that when 
selected, causes a window to be presented, the window permitting an amount of backup 
to be specified, the amount of backup including one of a predetermined amount of time 
and a predetermined number of words, and wherein the window further permits a name 
to be given for the translation and a location of publication to be specified in Foster and 
Schulz, since it would enable the system to save the file in memory so that it can be 
easily retrieved for further processing in the future. 

25. As per claim 47, Foster discloses a method comprising: 

A user viewing a textual information in a first language (page 179, section 3, first 
paragraph, the translator selects text in a first language to be translated); 

Said user translating said information thereby obtaining a translation in a second 
language (page 179, section 3, first paragraph, the translator selects a portion of the 
source text, usually a sentence, and types in the translation). 

Foster does not disclose a user listening to an audio playback of information in a 
first language while viewing a textual transcription of said information in said first 
language on a transcription section of a graphical user interface (GUI), said textual 
transcription being synchronized with said audio playback, said user translating the 
audio playback of said information, said user using a different section of said graphical 
user interface (GUI) to display said translation while making said translation. However, 
speech recognition systems are commonly used to convert speech to text, as indicated 
in Schulz (column 1 lines 27-34, speech recognition is used for transcription). Schulz 
also discloses a system that synchronizes text with a specific spoken word during 
playback of an audio file (column 5 lines 30-33). In Schulz, a text editor is used that 
automatically aligns a cursor in the written text on the screen with a specific spoken 
word during playback of the audio file. 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to combine the known elements of audio and text synchronization with 
Foster, since the combination would produce the predictable result of enabling the user 
to quickly and easily translate and edit text displayed on the monitor, including 
identifying and correcting errors, without interruption during playback of the speech from 
an audio recording, as indicated in Schulz (column 5 lines 55-58). 

Additionally, Shiotani discloses displaying the segment of the textual 
representation in a same window as will be used by the user to provide the translation 
of the portion of the audio signal, including as a split screen in a translation window 
(column 2 lines 15-20 and Figure 4(a) and 4(b)). Shiotani discloses a machine 
translation system where the source string and target string appear side-by-side in the 
same window. 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to display the segment of the textual representation in a same window 
as will be used by the user to provide the translation of the portion of the audio signal, 
including as a split screen in a translation window in Foster, since one of ordinary skill 
in the art has good reason to pursue the options within his or her technical grasp in 
order to achieve the predictable result of quickly and efficiently translating source 
information. 

Conclusion 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Dorothy Sarah Siedler whose telephone number is 571- 
270-1067. The examiner can normally be reached on Mon-Thur 9:30am-5:30pm. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil, can be reached on 571-272-7602. The fax phone 
number for the organization where this application or proceeding is assigned is 571- 
273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

DSS 

/Richemond Dorvil/ 
Supervisory Patent Examiner, Art Unit 2626 



